Pareto's law [1] states that the distribution of personal income
obeys a power-law in the high-income range, and has been supported
by international observations [2]-[7]. Researchers have proposed
models [8]-[14] for over a century since its discovery. However, the dynamical
nature of personal income has been little studied hitherto, mostly due 
to the lack of empirical work. Here we report the first such study, an 
examination of the fluctuations in personal income of about 80,000 
high-income taxpayers in Japan for two consecutive years, 1997 and 
1998, when the economy was relatively stable. We find that the dis- 
tribution of the growth rate in one year is independent of income in 
the previous year. This fact, combined with an approximate time- 
reversal symmetry, leads to the Pareto law, thereby explaining it as 
a consequence of a stable economy. We also derive a scaling relation 
between positive and negative growth rates, and show good agreement
with the data. These findings provide a direct observation
of the dynamical process of personal income flow, which has not yet
been studied as much as that of companies [15]-[20].



Flow and stock are the fundamental concepts in economics. They refer to a 
certain economic quantity in a given period of time and its accumulation at a 
point of time respectively. Personal income and wealth can be regarded as flow 
and stock observed at each individual in a giant dynamical network of people, 
which is open to various economic activities. The Italian social economist
Vilfredo Pareto [1], more than a century ago, studied the distribution of
personal income and wealth in society as a characterization of a country's
economic status. He found that the high-income distribution follows a
power-law: the probability that a given individual has income equal to or
greater than $x$, denoted by $P_>(x)$, obeys

$$P_>(x) \propto x^{-\mu}, \tag{1}$$

with a constant $\mu$ called the Pareto index. This phenomenon, now known as a
classic example of fractals, has been observed [2]-[7] in many different
countries, where $\mu$ typically varies around 2, reflecting economic conditions.
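
As a concrete reading of the definition of $P_>(x)$ in equation (1), the empirical cumulative distribution of a sample can be computed as in the following minimal sketch. The function name `empirical_ccdf` and the input layout (a plain list of incomes) are assumptions made for illustration only.

```python
def empirical_ccdf(incomes):
    """Return (x, P_>(x)) pairs, where P_>(x) is the fraction of the
    sample with income equal to or greater than x."""
    xs = sorted(incomes)
    n = len(xs)
    pairs = []
    for i, x in enumerate(xs):
        # Keep each distinct value once, at its first (lowest) index,
        # so that ties are all counted as ">= x".
        if i > 0 and xs[i - 1] == x:
            continue
        pairs.append((x, (n - i) / n))
    return pairs
```

On a log-log plot of these pairs, a power-law range appears as a straight line with slope $-\mu$.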

Recent high-quality digitized data show that the law holds in the
high-income range, often with remarkable accuracy, and allow precise
estimates of the Pareto index over the years. Fig. 1a shows the distribution
of Japanese personal income in the year 2000, derived from available data of
the Japanese National Tax Administration (NTA) (corresponding to the UK Board
of Inland Revenue). Power-law behavior is a salient feature characterizing
the high-income range over nearly three orders of magnitude.

Understanding the origin of the law is important in economics because
of its linkage with consumption, business cycles and other macro-economic
activities, and also for the practical assessment of economic inequality [Q. Many
researchers, recently including those in non-equilibrium statistical physics,
have proposed models [8]-[14]. Some theories were based on multiplicative
stochastic processes. A classic theory by Gibrat J§| assumed that personal
income depends on a number of causes, each of which has a proportional effect
that is independent of the proportional effects of the others, and also of
initial income (law of proportionate effect). This theory, basically a random
walk on the logarithmic scale of income, predicts a log-normal distribution of
income with a Gaussian growth rate, both in disagreement with actual data for
high income. One can introduce into the process a boundary effect, namely that
income should not be less than a certain value, and thereby derive a power-law
distribution |j| [l^]. Another approach is to construct a simple but minimal
economic model in a network of wealth JT^ . In fact, many kinds of scenarios
|2^] have been proposed which predict a power-law distribution as a static
snapshot. However, in order to test models, it is highly desirable to have a
direct observation of the dynamical process of growth and fluctuations of
personal income.
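
The multiplicative process with a lower bound described above can be illustrated by a small simulation. This is only a sketch under assumed parameters: the function name, the Gaussian log-growth with negative `drift`, the values of `sigma` and `floor`, and the clipping rule at the boundary are all choices made here for illustration, not taken from the models cited.

```python
import math
import random


def simulate_multiplicative_incomes(n_agents=1000, n_steps=300, drift=-0.01,
                                    sigma=0.1, floor=1.0, seed=0):
    """Random walk in log-income with a lower bound (illustrative values)."""
    rng = random.Random(seed)
    incomes = [floor] * n_agents
    for _ in range(n_steps):
        for i in range(n_agents):
            # Proportionate effect: the growth factor is drawn
            # independently of the current income.
            incomes[i] *= math.exp(rng.gauss(drift, sigma))
            # Boundary effect: income may not fall below the floor.
            if incomes[i] < floor:
                incomes[i] = floor
    return incomes
```

Without the floor, the logarithm of income simply diffuses and the distribution is log-normal. With the floor and a negative drift, the continuum approximation of such a bounded walk is expected to settle, after enough steps, into a tail close to a power law with index of roughly $2|\mathrm{drift}|/\sigma^2$; this is a general property of the toy process, not a result of the present data analysis.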

For that purpose, we employ Japanese income tax data which cover most
of the power-law region in Fig. 1a. It is an exhaustive list of all taxpayers,
with full names, addresses and tax amounts, who paid 10 million yen or more in
a year through tax offices of the Japanese National Tax Administration (NTA).
The data were gathered from all the NTA offices. In Fig. 1b, Pareto indices
estimated from such income tax data from 1987 to 2000 are plotted. $\mu$ changes
annually around 2, with an abrupt jump between 1991 and 1992. Before those
years, the Japanese economy experienced an abnormal rise in the prices of the
risky assets of land and shares due to speculative investment ("bubble"),
after which those prices fell rapidly. We examined a relatively stable period
in the economy, namely 1997 and 1998. The complete datasets of 93,394 persons
in 1997 and 84,571 in 1998 were used. An individual was identified as listed
in both years if and only if his/her full name matched uniquely and exactly in
both years with the same address (zip-code). Duplicate matches occurred in
only a few cases, which were discarded. We assumed that the fraction of persons
who changed address or name is negligible. The number of persons in the common
set, appearing in the two consecutive years, was 52,902. The remaining persons
in 1997 and 1998 can therefore be regarded as those disappearing from or newly
entering the list.
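
The identification rule described above (exact match of full name and zip-code, with ambiguous duplicates discarded) can be sketched as follows. The record layout `(full_name, zip_code, tax)` and the function name are hypothetical; the actual format of the NTA lists is not specified here.

```python
from collections import Counter


def match_taxpayers(list_a, list_b):
    """Match records of two years on (full_name, zip_code).

    Each list holds (full_name, zip_code, tax) tuples (assumed layout).
    Keys that occur more than once within a year are discarded.
    Returns a sorted list of (key, tax_in_a, tax_in_b).
    """
    def unique_by_key(records):
        counts = Counter((name, zc) for name, zc, _ in records)
        return {(name, zc): tax for name, zc, tax in records
                if counts[(name, zc)] == 1}

    a = unique_by_key(list_a)
    b = unique_by_key(list_b)
    common = sorted(a.keys() & b.keys())
    return [(key, a[key], b[key]) for key in common]
```

Records present in only one of the two lists correspond to persons disappearing from or newly entering the list.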

The common set is shown by the scatter plot in Fig. 2, where each point
represents a person who paid income tax of $T_1$ in 1997 and $T_2$ in 1998 (both
in units of thousand yen). This represents the joint distribution
$P_{12}(T_1, T_2)$. The plot is consistent with approximate time-reversal
symmetry in the sense that the joint distribution is invariant under the
exchange of the values $T_1$ and $T_2$.

Now the quantity of our concern is the annual change of individual income
tax, or growth. The growth rate is defined as $R = T_2/T_1$. It is customary to
use the logarithm of $R$, $r = \log_{10} R$. We examine the probability density
for the growth rate, $P(r|T_1)$, conditioned on the income $T_1$ in the initial
year being fixed. The result is shown in Fig. 3. Here we divide the range of
$T_1$ into logarithmically equal bins as
$T_1 \in [10^{4+0.2(n-1)}, 10^{4+0.2n}]$ with $n = 1, \cdots, 5$. For each bin,
the probability density for $r$ was calculated. As shown in the figure, the
plots for different $n$ collapse onto each other. This fact means that the
distribution of the growth rate $r$ is statistically independent of the initial
value $T_1$. In mathematical notation, we found that

$$P_{1R}(T_1, R) = P_1(T_1)\, P_R(R), \tag{2}$$

where $P_{1R}$ is the joint distribution for $T_1$ and $R$, and $P_1$ and $P_R$
are the distributions for $T_1$ and $R$ respectively.
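
The binning procedure above can be sketched in a few lines. The function names, the half-open treatment of bin edges, and the histogram range used for the density are assumptions for illustration; taxes are taken to be in units of thousand yen as in the text.

```python
import math


def bin_index(t1, log_min=4.0, width=0.2, n_bins=5):
    """Return n (1..n_bins) such that
    10**(log_min + width*(n-1)) <= t1 < 10**(log_min + width*n),
    or None if t1 lies outside all bins."""
    pos = (math.log10(t1) - log_min) / width
    if pos < 0 or pos >= n_bins:
        return None
    return int(pos) + 1


def growth_rates_by_bin(pairs):
    """Group r = log10(T2/T1) by the bin of the initial tax T1."""
    bins = {n: [] for n in range(1, 6)}
    for t1, t2 in pairs:
        n = bin_index(t1)
        if n is not None:
            bins[n].append(math.log10(t2 / t1))
    return bins


def density(values, r_min=-1.0, r_max=1.0, n_cells=20):
    """Histogram estimate of the probability density on [r_min, r_max)."""
    cell = (r_max - r_min) / n_cells
    counts = [0] * n_cells
    for r in values:
        if r_min <= r < r_max:
            counts[int((r - r_min) / cell)] += 1
    total = len(values)
    return [c / (total * cell) for c in counts] if total else counts
```

If the densities obtained for the five bins coincide within statistical error, the growth rate is independent of the initial value, which is the content of equation (2).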

This "universal" distribution for the growth rate has a skewed and
heavy-tailed shape with a peak at $R = 1$. How is such a functional form
consistent with the approximate time-reversal symmetry shown in Fig. 2? The
answer to this question leads us to an important bridge from the fluctuations
of growth rates to the Pareto law, as follows. The time-reversal symmetry
(Fig. 2) states that $P_{12}(T_1, T_2) = P_{12}(T_2, T_1)$. One can easily see
that under the change of variables from $(T_1, T_2)$ to $(T_1, R)$, the equality
$P_{1R}(T_1, R) = T_1 P_{12}(T_1, T_2)$ holds. This equality, together with the
time-reversal symmetry and the statistical independence of equation (2), leads
us to the relation:

$$\frac{P_1(T_2)}{P_1(T_1)} = \frac{R\, P_R(R)}{P_R(1/R)}. \tag{3}$$

The left-hand side is a function of $T_1$ and $T_2$, while the right-hand side
is a function of the ratio $R$ only. We can then conclude that the distribution
$P_1$ obeys a power-law, $P_1(x) \propto x^{-(\mu+1)}$, whose integral form gives
the expression in equation (1). Thus the independence of the growth rate from
the past value, together with the time-reversal symmetry, requires the Pareto
law.
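
The intermediate steps behind relation (3) and the power-law conclusion can be written out as follows. This is a sketch consistent with the definitions above; the auxiliary function $g(R)$ and the continuity assumption on $P_1$ are introduced here for illustration only.

```latex
\begin{align*}
P_{1R}(T_1, R) &= T_1\, P_{12}(T_1, T_2), \qquad T_2 = R\,T_1
  && \text{(Jacobian } \partial T_2/\partial R = T_1\text{)} \\
P_{1R}(T_2, 1/R) &= T_2\, P_{12}(T_2, T_1)
  && \text{(same identity evaluated at } (T_2, 1/R)\text{)} \\
\frac{P_{1R}(T_1, R)}{T_1} &= \frac{P_{1R}(T_2, 1/R)}{T_2}
  && \text{(time-reversal symmetry)} \\
\frac{P_1(T_1)\,P_R(R)}{T_1} &= \frac{P_1(T_2)\,P_R(1/R)}{T_2}
  && \text{(independence, equation (2))} \\
\frac{P_1(T_2)}{P_1(T_1)} &= \frac{R\,P_R(R)}{P_R(1/R)} \equiv g(R)
  && \text{(relation (3))}
\end{align*}
% Since P_1(R T_1) = g(R) P_1(T_1) for all T_1 and R in the range considered,
% a continuous P_1 must be a power law:
\begin{align*}
P_1(x) \propto x^{-(\mu+1)}, \qquad g(R) = R^{-(\mu+1)}, \qquad
P_>(x) = \int_x^\infty P_1(x')\,dx' \propto x^{-\mu} \quad (\mu > 0).
\end{align*}
```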

In addition, we have a scaling relation following immediately from the above
relation (3) and equation (1):

$$P_R(R) = R^{-\mu-2}\, P_R(1/R). \tag{4}$$

This equation relates the positive and negative growth rates through the
Pareto index $\mu$. In Fig. 3, we fitted $P_R(R)$ in the region of positive
growth, $r > 0$, with an analytic function, and then plotted its counterpart for
the negative growth rate, $r < 0$, derived from the scaling relation, equation
(4). The result fits the data in that region quite satisfactorily.
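
In terms of the logarithmic growth rate, the density of $r$ is $q(r) = \ln(10)\, R\, P_R(R)$ with $R = 10^r$, so the scaling relation (4) becomes $q(-r) = 10^{\mu r} q(r)$ for $r > 0$. The sketch below builds the predicted negative-growth density from a positive-growth fit. The function name is hypothetical, and the exponential `q_pos` in the usage lines is only a stand-in, since the analytic form of the fit is not given in the text.

```python
import math


def negative_growth_density(q_pos, mu):
    """Return q(r) for r < 0 implied by the scaling relation (4),
    given a fitted density q_pos(r) for r > 0 and a Pareto index mu."""
    def q_neg(r):
        if r >= 0:
            raise ValueError("q_neg is defined for negative growth only")
        # q(r) = 10**(mu*|r|) * q(|r|), written with r < 0.
        return 10.0 ** (-mu * r) * q_pos(-r)
    return q_neg


# Usage with a stand-in fit (assumed form, for illustration only).
q_pos = lambda r: math.exp(-8.0 * r)
q_neg = negative_growth_density(q_pos, mu=2.0)
```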

In summary, the statistical independence of the growth rate, the approximate
time-reversal symmetry and the power-law are consistent with each other.
According to a sample survey by the NTA on income earners with total income
exceeding 50 million yen and on their sources of earnings, the sources are
employment income, income from real estate, and capital gains from land and
shares. As a fraction of the income amount, capital gains from risky assets
considerably exceed other, non-risky income sources. It would be expected that
asymmetric behavior of price fluctuations in those risky assets, and the
accompanying increase in the number of high-income persons, causes a breakdown
of time-reversal symmetry, which necessarily brings about the invalidity of
Pareto's law. This was actually the case in the "bubble" phase of the Japanese
economy, during which the prices of risky assets, especially of land, rose
abnormally compared to their fundamental values. Fig. 4 shows the cumulative
distributions of income tax in 1991 (the peak of the speculative bubble) and
1992. One can observe that the 1991 data cannot be fitted by Pareto's law over
the entire high-income range, in contrast to the 1992 data.

Our findings in this work should serve as an empirical test for models of
personal income and wealth, where people choose among assets with different
risks and returns, with changing degrees of freedom. Personal income is not
the only example of such systems; other systems composed of economic agents
|pi}, including companies, institutions and nations, might be worth examining
from this new perspective. Indeed, comparison with and similar analysis of
company growth, which has been studied extensively [15]-[20] and where the
Zipf law ($\mu = 1$) is observed, would be an interesting subject.

Figure 1: Personal income in Japan. a. Cumulative probability distribution of
personal income from the low to the high income range in the year 2000. A data
point represents the probability (vertical axis) that a person has income equal
to or more than the income given on the horizontal axis. Three datasets
available from the Japanese National Tax Administration (NTA) were used.
(i) Income tax data (dots): the exhaustive list of all taxpayers, about 80,000,
who paid income tax of 10 million yen or more. The tax value was converted to
income uniformly by the same proportionality, following the previous work[^).
(ii) Income data (squares): coarsely tabulated data for all the persons, about
7,273,000, who filed a tax return. (iii) Employment income data: a sample survey
of salaried workers in private enterprises, about 44,940,000. Under Japanese
taxation, all persons with income exceeding 20 million yen are obliged to file
a final declaration with the NTA each year. Thus dataset (ii) includes all the
persons listed in (i), so we have a reliable profile in the high-income range
(> 20 million yen). For lower income, an upper-bound estimate (triangles) was
given by overlapping the datasets (ii) and (iii), which was found relatively
good||.
b. Annual change of the Pareto index $\mu$ from the year 1987 to 2000. The
complete list of income tax data in each year was used. Excluding the top 0.1
percent and bottom 10 percent, samples equally spaced in the logarithm of rank
were plotted, from which slopes were estimated by a least-squares fit. Error
bars shown are the standard error (90% level) of the estimate of $\mu$ (dots).
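
The estimation procedure described in the caption of Fig. 1b can be sketched as follows. The function name, the default number of sampled ranks, and the choice to regress $\log_{10} P_>$ on $\log_{10} x$ (rather than the reverse) are assumptions for illustration; the caption does not specify them.

```python
import math


def estimate_pareto_index(values, top_frac=0.001, bottom_frac=0.10,
                          n_samples=50):
    """Least-squares estimate of mu from a rank plot.

    Drops the top `top_frac` and bottom `bottom_frac` of the sample,
    picks ranks equally spaced in log(rank), and fits a straight line
    to log10(rank / N) against log10(value); mu is minus the slope.
    """
    xs = sorted(values, reverse=True)          # rank 1 = largest value
    n = len(xs)
    k_min = max(1, int(math.ceil(top_frac * n)) + 1)
    k_max = int(math.floor((1.0 - bottom_frac) * n))
    ranks = sorted({
        int(round(k_min * (k_max / k_min) ** (j / (n_samples - 1))))
        for j in range(n_samples)
    })
    pts = [(math.log10(xs[k - 1]), math.log10(k / n)) for k in ranks]
    mean_x = sum(p[0] for p in pts) / len(pts)
    mean_y = sum(p[1] for p in pts) / len(pts)
    sxx = sum((p[0] - mean_x) ** 2 for p in pts)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pts)
    return -sxy / sxx
```

For a sample that follows equation (1) exactly, the fitted slope reproduces $\mu$; for real data the result depends on the excluded fractions, which is why they are stated explicitly in the captions.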

Figure 2: Scatter plot of all the individuals whose income tax exceeds 10
million yen in both 1997 and 1998. These points (52,902) were identified from
the complete lists of high-income taxpayers in 1997 (93,394) and in 1998
(84,571) (numbers of persons in parentheses), with income taxes $T_1$ and $T_2$
in the respective years. A few points with $T_1$ and/or $T_2$ exceeding $10^4$
exist but are not shown here.

Figure 3: Probability density $P(r|T_1)$ of the growth rate
$r = \log_{10}(T_2/T_1)$ from 1997 to 1998. Note that due to the limit
$T_1 > 10^4$ (in thousand yen), the data for large negative growth,
$r < 4 - \log_{10} T_1$, are not available. Different bins of initial income
tax with equal size on a logarithmic scale were taken as
$T_1 \in [10^{4+0.2(n-1)}, 10^{4+0.2n}]$ ($n = 1, \cdots, 5$) to plot probability
densities separately for each bin. All the densities collapse onto the same
curve. This fact means that $P(r|T_1)$ does not depend on $T_1$. The solid line
in the region of positive growth ($r > 0$) is an analytic fit. The dashed line
($r < 0$), on the other side, is calculated from the fit by the predicted
relation given in equation (4), which follows from the statistical independence
shown here and the approximate time-reversal symmetry. The predicted density of
negative growth fits the actual data quite well.

Figure 4: Cumulative probability distributions of income tax in 1991 and 1992.
The Pareto index for the 1992 data was estimated by excluding the top 0.1
percent and bottom 10 percent, sampling equally on a logarithmic scale, and
estimating by a least-squares fit, which gives the fitted line
($\mu = 2.057 \pm 0.005$).